feat: add Typecast TTS provider#7436
feat: add Typecast TTS provider#7436jaebong-human wants to merge 1 commit intoAstrBotDevs:masterfrom
Conversation
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- The repeated try/except blocks for parsing numeric config values (emotion_intensity, volume, pitch, tempo, timeout) could be consolidated into a small helper to reduce duplication and keep the constructor easier to read.
- Consider using pathlib.Path for building and handling the temporary audio file path instead of os.path to stay consistent with modern path handling and make the code more test-friendly.
- When checking the response content type, normalizing or lowercasing the header value (and potentially using startswith('audio/') on a lowercased string) would make the check more robust against case or parameter variations.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The repeated try/except blocks for parsing numeric config values (emotion_intensity, volume, pitch, tempo, timeout) could be consolidated into a small helper to reduce duplication and keep the constructor easier to read.
- Consider using pathlib.Path for building and handling the temporary audio file path instead of os.path to stay consistent with modern path handling and make the code more test-friendly.
- When checking the response content type, normalizing or lowercasing the header value (and potentially using startswith('audio/') on a lowercased string) would make the check more robust against case or parameter variations.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/typecast_tts_source.py" line_range="94-96" />
<code_context>
+ }
+ body = self._build_request_body(text)
+
+ async with AsyncClient(
+ timeout=self.timeout,
+ proxy=self.proxy if self.proxy else None,
+ ).stream(
+ "POST",
</code_context>
<issue_to_address>
**issue (bug_risk):** The `proxy` argument to `AsyncClient` is likely invalid; httpx expects `proxies` instead.
`AsyncClient` only accepts `proxies`, not `proxy`, so this will raise a `TypeError` when a proxy is configured and break all requests in that case. Please update this to use `proxies={"all": self.proxy}` (or the appropriate mapping for your use case).
</issue_to_address>
### Comment 2
<location path="astrbot/core/provider/sources/typecast_tts_source.py" line_range="36-57" />
<code_context>
+ self.emotion_preset: str = provider_config.get(
+ "typecast-emotion-preset", "normal"
+ )
+ try:
+ self.emotion_intensity: float = float(
+ provider_config.get("typecast-emotion-intensity", 1.0)
+ )
+ except ValueError:
+ self.emotion_intensity = 1.0
+ try:
+ self.volume: int = int(provider_config.get("typecast-volume", 100))
+ except ValueError:
+ self.volume = 100
+ try:
+ self.pitch: int = int(provider_config.get("typecast-pitch", 0))
+ except ValueError:
+ self.pitch = 0
+ try:
+ self.tempo: float = float(provider_config.get("typecast-tempo", 1.0))
+ except ValueError:
+ self.tempo = 1.0
+ try:
+ self.timeout: int = int(provider_config.get("timeout", 30))
+ except ValueError:
+ self.timeout = 30
</code_context>
<issue_to_address>
**suggestion:** Timeout parsing only catches `ValueError`, missing potential `TypeError` from unexpected config types.
If `provider_config.get(...)` returns `None` or another non-string/number, `int(...)` / `float(...)` will raise `TypeError`, not `ValueError`. For these numeric fields, consider catching `(ValueError, TypeError)` or using a small helper to coerce values with a default safely.
```suggestion
try:
self.emotion_intensity: float = float(
provider_config.get("typecast-emotion-intensity", 1.0)
)
except (TypeError, ValueError):
self.emotion_intensity = 1.0
try:
self.volume: int = int(provider_config.get("typecast-volume", 100))
except (TypeError, ValueError):
self.volume = 100
try:
self.pitch: int = int(provider_config.get("typecast-pitch", 0))
except (TypeError, ValueError):
self.pitch = 0
try:
self.tempo: float = float(provider_config.get("typecast-tempo", 1.0))
except (TypeError, ValueError):
self.tempo = 1.0
try:
self.timeout: int = int(provider_config.get("timeout", 30))
except (TypeError, ValueError):
self.timeout = 30
```
</issue_to_address>
### Comment 3
<location path="tests/test_typecast_tts_source.py" line_range="146-167" />
<code_context>
+ assert body["output"]["volume"] == 100
+
+
+def test_provider_config_defaults():
+ """Default config values are applied correctly."""
+ provider = ProviderTypecastTTS(
+ provider_config={
+ "id": "test-typecast",
+ "type": "typecast_tts",
+ "api_key": "test-api-key",
+ "typecast-voice-id": "tc_60e5426de8b95f1d3000d7b5",
+ },
+ provider_settings={},
+ )
+ assert provider.voice_id == "tc_60e5426de8b95f1d3000d7b5"
+ assert provider.model_name == "ssfm-v30"
+ assert provider.language == "kor"
+ assert provider.emotion_preset == "normal"
+ assert provider.emotion_intensity == 1.0
+ assert provider.volume == 100
+ assert provider.pitch == 0
+ assert provider.tempo == 1.0
+ assert provider.timeout == 30
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests for invalid numeric config values to ensure fallback defaults are actually used
Right now `test_provider_config_defaults` only exercises the happy path. Please add a test (e.g. `test_provider_config_invalid_numbers_use_defaults`) that sets `typecast-emotion-intensity`, `typecast-volume`, `typecast-pitch`, `typecast-tempo`, and `timeout` to non-numeric strings and asserts that each is reset to its default value via the `try/except ValueError` handling.
```suggestion
def test_provider_config_defaults():
"""Default config values are applied correctly."""
provider = ProviderTypecastTTS(
provider_config={
"id": "test-typecast",
"type": "typecast_tts",
"api_key": "test-api-key",
"typecast-voice-id": "tc_60e5426de8b95f1d3000d7b5",
},
provider_settings={},
)
assert provider.voice_id == "tc_60e5426de8b95f1d3000d7b5"
assert provider.model_name == "ssfm-v30"
assert provider.language == "kor"
assert provider.emotion_preset == "normal"
assert provider.emotion_intensity == 1.0
assert provider.volume == 100
assert provider.pitch == 0
assert provider.tempo == 1.0
assert provider.timeout == 30
def test_provider_config_invalid_numbers_use_defaults():
"""Invalid numeric config values fall back to defaults via ValueError handling."""
provider = ProviderTypecastTTS(
provider_config={
"id": "test-typecast",
"type": "typecast_tts",
"api_key": "test-api-key",
"typecast-voice-id": "tc_60e5426de8b95f1d3000d7b5",
"typecast-emotion-intensity": "not-a-number",
"typecast-volume": "not-a-number",
"typecast-pitch": "not-a-number",
"typecast-tempo": "not-a-number",
"timeout": "not-a-number",
},
provider_settings={},
)
# All invalid numeric configs should fall back to their default values
assert provider.emotion_intensity == 1.0
assert provider.volume == 100
assert provider.pitch == 0
assert provider.tempo == 1.0
assert provider.timeout == 30
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Code Review
This pull request introduces a new Typecast TTS provider, ProviderTypecastTTS, for text-to-speech functionality, along with comprehensive unit tests. The provider handles various configuration parameters like API key, voice ID, emotion, volume, pitch, tempo, and proxy settings. It implements an asynchronous method to fetch audio from the Typecast API, stream it to a temporary WAV file, and includes error handling for API failures. Feedback from the review highlights several areas for improvement: addressing a potential resource leak by properly closing httpx.AsyncClient using async with for both the client and the stream, mitigating blocking I/O operations by using asynchronous file handling or asyncio.to_thread for file writes, refactoring repetitive configuration parsing logic in the __init__ method for better maintainability, and ensuring the temporary directory exists before writing files to it.
| async with AsyncClient( | ||
| timeout=self.timeout, | ||
| proxy=self.proxy if self.proxy else None, | ||
| ).stream( | ||
| "POST", | ||
| self.API_URL, | ||
| headers=headers, | ||
| json=body, | ||
| ) as response: |
There was a problem hiding this comment.
The AsyncClient is instantiated but never closed, which will lead to resource leaks (unclosed connections). In httpx, the client should be used as a context manager or closed explicitly. Using the async with ... as client, client.stream(...) syntax fixes this while preserving the current indentation.
| async with AsyncClient( | |
| timeout=self.timeout, | |
| proxy=self.proxy if self.proxy else None, | |
| ).stream( | |
| "POST", | |
| self.API_URL, | |
| headers=headers, | |
| json=body, | |
| ) as response: | |
| async with AsyncClient( | |
| timeout=self.timeout, | |
| proxy=self.proxy if self.proxy else None, | |
| ) as client, client.stream( | |
| "POST", | |
| self.API_URL, | |
| headers=headers, | |
| json=body, | |
| ) as response: |
There was a problem hiding this comment.
Fixed — now using async with AsyncClient(...) as client, client.stream(...) as response pattern.
| with open(path, "wb") as f: | ||
| async for chunk in response.aiter_bytes(): | ||
| f.write(chunk) |
There was a problem hiding this comment.
Noted — this follows the same pattern as the existing TTS providers (e.g. fishaudio_tts_api_source.py, edge_tts_source.py). Will keep consistent for now.
| try: | ||
| self.emotion_intensity: float = float( | ||
| provider_config.get("typecast-emotion-intensity", 1.0) | ||
| ) | ||
| except ValueError: | ||
| self.emotion_intensity = 1.0 | ||
| try: | ||
| self.volume: int = int(provider_config.get("typecast-volume", 100)) | ||
| except ValueError: | ||
| self.volume = 100 | ||
| try: | ||
| self.pitch: int = int(provider_config.get("typecast-pitch", 0)) | ||
| except ValueError: | ||
| self.pitch = 0 | ||
| try: | ||
| self.tempo: float = float(provider_config.get("typecast-tempo", 1.0)) | ||
| except ValueError: | ||
| self.tempo = 1.0 | ||
| try: | ||
| self.timeout: int = int(provider_config.get("timeout", 30)) | ||
| except ValueError: | ||
| self.timeout = 30 |
There was a problem hiding this comment.
The configuration parsing logic is repetitive. Using a helper function can improve maintainability and readability.
def get_cfg(key, default, type_func):
try:
return type_func(provider_config.get(key, default))
except (ValueError, TypeError):
return default
self.emotion_intensity: float = get_cfg("typecast-emotion-intensity", 1.0, float)
self.volume: int = get_cfg("typecast-volume", 100, int)
self.pitch: int = get_cfg("typecast-pitch", 0, int)
self.tempo: float = get_cfg("typecast-tempo", 1.0, float)
self.timeout: int = get_cfg("timeout", 30, int)There was a problem hiding this comment.
Fixed — extracted _safe_cast() helper function.
| temp_dir = get_astrbot_temp_path() | ||
| path = os.path.join(temp_dir, f"typecast_tts_{uuid.uuid4()}.wav") |
There was a problem hiding this comment.
It is safer to ensure that the temporary directory exists before attempting to create a file within it, as get_astrbot_temp_path() might return a path that hasn't been created yet.
| temp_dir = get_astrbot_temp_path() | |
| path = os.path.join(temp_dir, f"typecast_tts_{uuid.uuid4()}.wav") | |
| temp_dir = get_astrbot_temp_path() | |
| os.makedirs(temp_dir, exist_ok=True) | |
| path = os.path.join(temp_dir, f"typecast_tts_{uuid.uuid4()}.wav") |
There was a problem hiding this comment.
Fixed — added os.makedirs(temp_dir, exist_ok=True).
fcfad5f to
18fad44
Compare
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The
AsyncClientinitialization uses aproxykeyword, but httpx expectsproxies(or a mount/transport) for proxy configuration; as written the proxy setting is likely ignored, so consider switching to the correct argument or configuration pattern. - You may want to validate
typecast-emotion-presetagainst the limited set of supported values (normal/happy/sad/angry/whisper/toneup/tonedown) so that invalid configuration is caught early instead of being passed through to the API.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `AsyncClient` initialization uses a `proxy` keyword, but httpx expects `proxies` (or a mount/transport) for proxy configuration; as written the proxy setting is likely ignored, so consider switching to the correct argument or configuration pattern.
- You may want to validate `typecast-emotion-preset` against the limited set of supported values (normal/happy/sad/angry/whisper/toneup/tonedown) so that invalid configuration is caught early instead of being passed through to the API.
## Individual Comments
### Comment 1
<location path="tests/test_typecast_tts_source.py" line_range="29-38" />
<code_context>
+ )
+
+
+@pytest.mark.asyncio
+async def test_get_audio_empty_text():
+ """Empty text raises ValueError."""
+ provider = _make_provider()
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test for whitespace-only text, not just an empty string.
Since `get_audio` checks `not text.strip()`, please add a case with whitespace-only input (e.g. `" "` or `"\n\t"`) to confirm it also raises the same `ValueError`. This helps catch regressions if the `.strip()` behavior changes.
Suggested implementation:
```python
@pytest.mark.asyncio
async def test_get_audio_whitespace_text():
"""Whitespace-only text raises ValueError."""
provider = _make_provider()
with pytest.raises(ValueError):
await provider.get_audio(" ")
@pytest.mark.asyncio
async def test_get_audio_success(tmp_path: Path, monkeypatch: pytest.MonkeyPatch):
"""Successful API call saves WAV and returns path."""
provider = _make_provider()
monkeypatch.setattr(
"astrbot.core.provider.sources.typecast_tts_source.get_astrbot_temp_path",
lambda: str(tmp_path),
)
fake_response = AsyncMock()
fake_response.status_code = 200
fake_response.headers = {"content-type": "audio/wav"}
fake_response.aiter_bytes = lambda: _async_iter([b"RIFF", b"fake_wav_data"])
```
To keep the tests consistent, you may want to:
1. Mirror the exact `get_audio` call signature used in `test_get_audio_empty_text` (e.g., language, voice, or any other parameters) instead of the minimal positional call shown here.
2. Optionally add a `match="..."` argument to `pytest.raises(ValueError, match="...")` if the existing empty-text test also checks the error message.
</issue_to_address>
### Comment 2
<location path="tests/test_typecast_tts_source.py" line_range="10-19" />
<code_context>
+from astrbot.core.provider.sources.typecast_tts_source import ProviderTypecastTTS
+
+
+def _make_provider(**overrides) -> ProviderTypecastTTS:
+ config = {
+ "id": "test-typecast",
+ "type": "typecast_tts",
+ "api_key": "test-api-key",
+ "typecast-voice-id": "tc_60e5426de8b95f1d3000d7b5",
+ "model": "ssfm-v30",
+ "language": "kor",
+ "typecast-emotion-preset": "normal",
+ "typecast-emotion-intensity": 1.0,
+ "typecast-volume": 100,
+ "typecast-pitch": 0,
+ "typecast-tempo": 1.0,
+ "timeout": 30,
+ }
+ config.update(overrides)
+ return ProviderTypecastTTS(provider_config=config, provider_settings={})
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests to verify that `timeout` and `proxy` config values are actually passed through to `AsyncClient`.
These tests cover behavior but don’t verify that `ProviderTypecastTTS` passes `timeout` and `proxy` into `AsyncClient`. Please add a parametrized test that sets custom `timeout`/`proxy` in `provider_config`, monkeypatches `AsyncClient`, and asserts that it’s constructed with those values. This also guards against regressions if the `AsyncClient` signature changes.
Suggested implementation:
```python
import os
from pathlib import Path
from unittest.mock import AsyncMock, MagicMock
import pytest
from astrbot.core.provider.sources.typecast_tts_source import ProviderTypecastTTS
def _make_provider(**overrides) -> ProviderTypecastTTS:
config = {
"id": "test-typecast",
"type": "typecast_tts",
"api_key": "test-api-key",
"typecast-voice-id": "tc_60e5426de8b95f1d3000d7b5",
"model": "ssfm-v30",
"language": "kor",
"typecast-emotion-preset": "normal",
"typecast-emotion-intensity": 1.0,
"typecast-volume": 100,
"typecast-pitch": 0,
"typecast-tempo": 1.0,
"timeout": 30,
}
config.update(overrides)
return ProviderTypecastTTS(provider_config=config, provider_settings={})
@pytest.mark.parametrize(
"timeout,proxy",
[
(5, None),
(10, "http://localhost:8080"),
],
)
@pytest.mark.asyncio
async def test_typecast_tts_passes_timeout_and_proxy_to_async_client(monkeypatch, timeout, proxy):
async_client_cls = MagicMock()
async_client_instance = AsyncMock()
async_client_cls.return_value = async_client_instance
monkeypatch.setattr(
"astrbot.core.provider.sources.typecast_tts_source.AsyncClient",
async_client_cls,
)
# Instantiate the provider with overridden timeout/proxy config
_ = _make_provider(timeout=timeout, proxy=proxy)
assert async_client_cls.call_count == 1
_, kwargs = async_client_cls.call_args
# Verify timeout is passed through
assert kwargs.get("timeout") == timeout
# Verify proxy is passed through (if set)
if proxy is not None:
assert kwargs.get("proxy") == proxy
else:
# When not explicitly set, proxy should either be absent or None
assert "proxy" not in kwargs or kwargs["proxy"] is None
```
1. This test assumes that `ProviderTypecastTTS` constructs an `AsyncClient` in its `__init__` or during instantiation. If the client is only created lazily (e.g., on a `synthesize`/`generate`/`text_to_speech` call), you should trigger that method in the test after `_make_provider(...)` so that `AsyncClient` is actually called.
2. If `ProviderTypecastTTS` passes `proxies` instead of `proxy` to `AsyncClient`, adjust the assertions to check for the `proxies` keyword instead.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
7595b5c to
3f023be
Compare
|
Addressed all review feedback in the latest push:
12 tests, all passing. |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- Consider moving
VALID_EMOTION_PRESETSto a module-level constant so it isn’t redefined on everyProviderTypecastTTSinstantiation and can be reused in other helpers/tests if needed. - The
2000character maximum and default numeric values (e.g., timeout, volume) are currently magic numbers; extracting them into named constants would make the limits and defaults clearer and easier to adjust.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider moving `VALID_EMOTION_PRESETS` to a module-level constant so it isn’t redefined on every `ProviderTypecastTTS` instantiation and can be reused in other helpers/tests if needed.
- The `2000` character maximum and default numeric values (e.g., timeout, volume) are currently magic numbers; extracting them into named constants would make the limits and defaults clearer and easier to adjust.
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/typecast_tts_source.py" line_range="115-117" />
<code_context>
+ }
+ body = self._build_request_body(text)
+
+ async with AsyncClient(
+ timeout=self.timeout,
+ proxy=self.proxy if self.proxy else None,
+ ) as client, client.stream(
+ "POST",
</code_context>
<issue_to_address>
**issue (bug_risk):** The `proxy` argument to `AsyncClient` may not be supported depending on the httpx version and could lead to runtime errors.
`AsyncClient` usually takes `proxies=` (dict or URL) rather than `proxy=`. On versions where `proxy` isn’t supported, this will raise at runtime. Consider normalizing to a `proxies` dict (or using env vars) and passing it via `proxies=` for version compatibility.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
Re: moving This is Typecast-specific and not reused elsewhere. Other TTS providers in this codebase ( |
Add Typecast TTS provider to support Typecast AI's text-to-speech API.
Modifications / 改动点
Added
astrbot/core/provider/sources/typecast_tts_source.py— new TTS provider that callsPOST https://api.typecast.ai/v1/text-to-speechSupports preset emotion (normal/happy/sad/angry/whisper/toneup/tonedown), volume, pitch, tempo, language settings
Input validation: required config fields (
api_key,voice_id), text length (1-2000 chars), emotion preset valuesRobust config parsing with
_safe_cast()helper (handles TypeError + ValueError)Follows existing TTS provider pattern (TTSProvider subclass +
@register_provider_adapterdecorator)No new dependencies introduced (uses existing
httpx)This is NOT a breaking change. / 这不是一个破坏性变更。
Screenshots or Test Results / 运行截图或测试结果
Checklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了"验证步骤"和"运行截图"。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。
Summary by Sourcery
Add a new Typecast-based text-to-speech provider and cover it with validation and integration tests.
New Features:
Tests: